{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Research Workflow\n",
    "\n",
    "This notebook demonstrates the research [workflow](https://langchain-ai.github.io/langgraph/tutorials/workflows/) that creates comprehensive reports through a series of focused steps. The system:\n",
    "\n",
    "1. Uses a **graph workflow** with specialized nodes for each report creation stage\n",
    "2. Enables user **feedback and approval** at critical planning points \n",
    "3. Produces a well-structured report with introduction, researched body sections, and conclusion\n",
    "\n",
    "## From repo "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "/Users/rlm/Desktop/Code/open_deep_research/src\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/rlm/Desktop/Code/open_deep_research/open-deep-research-env/lib/python3.11/site-packages/IPython/core/magics/osm.py:417: UserWarning: This is now an optional IPython functionality, setting dhist requires you to install the `pickleshare` library.\n",
      "  self.shell.db['dhist'] = compress_dhist(dhist)[-100:]\n"
     ]
    }
   ],
   "source": [
    "%cd ..\n",
    "%load_ext autoreload\n",
    "%autoreload 2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## From package "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.2.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.0.1\u001b[0m\n",
      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "! pip install -U -q open-deep-research"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Compile the Graph-Based Research Workflow\n",
    "\n",
    "The next step is to compile the LangGraph workflow that orchestrates the report creation process. This defines the sequence of operations and decision points in the research pipeline."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Import required modules and initialize the builder from open_deep_research\n",
    "import uuid \n",
    "import os, getpass\n",
    "import open_deep_research   \n",
    "print(open_deep_research.__version__) \n",
    "from IPython.display import Image, display, Markdown\n",
    "from langgraph.types import Command\n",
    "from langgraph.checkpoint.memory import MemorySaver\n",
    "from open_deep_research.graph import builder"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create a memory-based checkpointer and compile the graph\n",
    "# This enables state persistence and tracking throughout the workflow execution\n",
    "\n",
    "memory = MemorySaver()\n",
    "graph = builder.compile(checkpointer=memory)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Visualize the graph structure\n",
    "# This shows the nodes and edges in the research workflow\n",
    "\n",
    "display(Image(graph.get_graph(xray=1).draw_mermaid_png()))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Helper function to set environment variables for API keys\n",
    "# This ensures all necessary credentials are available for various services\n",
    "\n",
    "def _set_env(var: str):\n",
    "    if not os.environ.get(var):\n",
    "        os.environ[var] = getpass.getpass(f\"{var}: \")\n",
    "\n",
    "# Set the API keys used for any model or search tool selections below, such as:\n",
    "_set_env(\"OPENAI_API_KEY\")\n",
    "_set_env(\"ANTHROPIC_API_KEY\")\n",
    "_set_env(\"TAVILY_API_KEY\")\n",
    "_set_env(\"GROQ_API_KEY\")\n",
    "_set_env(\"PERPLEXITY_API_KEY\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define report structure template and configure the research workflow\n",
    "# This sets parameters for models, search tools, and report organization\n",
    "\n",
    "REPORT_STRUCTURE = \"\"\"Use this structure to create a report on the user-provided topic:\n",
    "\n",
    "1. Introduction (no research needed)\n",
    "   - Brief overview of the topic area\n",
    "\n",
    "2. Main Body Sections:\n",
    "   - Each section should focus on a sub-topic of the user-provided topic\n",
    "   \n",
    "3. Conclusion\n",
    "   - Aim for 1 structural element (either a list of table) that distills the main body sections \n",
    "   - Provide a concise summary of the report\"\"\"\n",
    "\n",
    "# Configuration option 1: Claude 3.7 Sonnet for planning with perplexity search\n",
    "thread = {\"configurable\": {\"thread_id\": str(uuid.uuid4()),\n",
    "                           \"search_api\": \"perplexity\",\n",
    "                           \"planner_provider\": \"anthropic\",\n",
    "                           \"planner_model\": \"claude-3-7-sonnet-latest\",\n",
    "                           # \"planner_model_kwargs\": {\"temperature\":0.8}, # if set custom parameters\n",
    "                           \"writer_provider\": \"anthropic\",\n",
    "                           \"writer_model\": \"claude-3-5-sonnet-latest\",\n",
    "                           # \"writer_model_kwargs\": {\"temperature\":0.8}, # if set custom parameters\n",
    "                           \"max_search_depth\": 2,\n",
    "                           \"report_structure\": REPORT_STRUCTURE,\n",
    "                           }}\n",
    "\n",
    "# Configuration option 2: DeepSeek-R1-Distill-Llama-70B for planning and llama-3.3-70b-versatile for writing\n",
    "thread = {\"configurable\": {\"thread_id\": str(uuid.uuid4()),\n",
    "                           \"search_api\": \"tavily\",\n",
    "                           \"planner_provider\": \"groq\",\n",
    "                           \"planner_model\": \"deepseek-r1-distill-llama-70b\",\n",
    "                           \"writer_provider\": \"groq\",\n",
    "                           \"writer_model\": \"llama-3.3-70b-versatile\",\n",
    "                           \"report_structure\": REPORT_STRUCTURE,\n",
    "                           \"max_search_depth\": 1,}\n",
    "                           }\n",
    "\n",
    "# Configuration option 3: Use OpenAI o3 for both planning and writing (selected option)\n",
    "thread = {\"configurable\": {\"thread_id\": str(uuid.uuid4()),\n",
    "                           \"search_api\": \"tavily\",\n",
    "                           \"planner_provider\": \"openai\",\n",
    "                           \"planner_model\": \"o3\",\n",
    "                           \"writer_provider\": \"openai\",\n",
    "                           \"writer_model\": \"o3\",\n",
    "                           \"max_search_depth\": 2,\n",
    "                           \"report_structure\": REPORT_STRUCTURE,\n",
    "                           }}\n",
    "\n",
    "# Define research topic about Model Context Protocol\n",
    "topic = \"Overview of Model Context Protocol (MCP), an Anthropic‑backed open standard for integrating external context and tools with LLMs. Give an architectural overview for developers, tell me about interesting MCP servers, and compare to google Agent2Agent (A2A) protocol.\"\n",
    "\n",
    "# Run the graph workflow until first interruption (waiting for user feedback)\n",
    "async for event in graph.astream({\"topic\":topic,}, thread, stream_mode=\"updates\"):\n",
    "    if '__interrupt__' in event:\n",
    "        interrupt_value = event['__interrupt__'][0].value\n",
    "        display(Markdown(interrupt_value))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# User Feedback Phase\n",
    "\n",
    "* This allows for providing directed feedback on the initial report plan\n",
    "* The user can review the proposed report structure and provide specific guidance\n",
    "* The system will incorporate this feedback into the final report plan"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Submit feedback on the report plan\n",
    "# The system will continue execution with the updated requirements\n",
    "\n",
    "# Provide specific feedback to focus and refine the report structure\n",
    "async for event in graph.astream(Command(resume=\"Looks great! Just do one section related to Agent2Agent (A2A) protocol, introducing it and comparing to MCP.\"), thread, stream_mode=\"updates\"):\n",
    "    if '__interrupt__' in event:\n",
    "        interrupt_value = event['__interrupt__'][0].value\n",
    "        display(Markdown(interrupt_value))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Final Approval Phase\n",
    "* After incorporating feedback, approve the plan to start content generation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Approve the final plan and execute the report generation\n",
    "# This triggers the research and writing phases for all sections\n",
    "\n",
    "# The system will now:\n",
    "# 1. Research each section topic\n",
    "# 2. Generate content with citations\n",
    "# 3. Create introduction and conclusion\n",
    "# 4. Compile the final report\n",
    "\n",
    "async for event in graph.astream(Command(resume=True), thread, stream_mode=\"updates\"):\n",
    "    print(event)\n",
    "    print(\"\\n\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "# Introduction  \n",
       "Large language models excel at reasoning, but without structured access to the outside world they remain isolated. The Model Context Protocol (MCP) bridges this gap, defining an open, vendor‑neutral way for models to tap files, databases, APIs, and other tools through simple JSON‑RPC exchanges. This report walks developers through the protocol’s architecture, surveys real‑world MCP servers that showcase its flexibility, and contrasts MCP with Google’s emerging Agent‑to‑Agent (A2A) standard. By the end, you should know when, why, and how to weave MCP into your own agentic systems.\n",
       "\n",
       "## MCP Architectural Overview for Developers\n",
       "\n",
       "MCP uses a client‑host‑server model: a host process spawns isolated clients, and every client keeps a 1‑to‑1, stateful session with a single server that exposes prompts, resources, and tools through JSON‑RPC 2.0 messages [1][5].  \n",
       "\n",
       "A session passes through three phases — initialize, operation, shutdown. The client begins with an initialize request that lists its protocolVersion and capabilities; the server replies with a compatible version and its own capabilities. After the client’s initialized notification, both sides may exchange requests, responses, or one‑way notifications under the agreed capabilities [2].  \n",
       "\n",
       "Two official transports exist. Stdio is ideal for local child processes, while HTTP (SSE/“streamable HTTP”) supports multi‑client, remote scenarios. Both must preserve JSON‑RPC framing, and servers should validate Origin headers, bind to localhost where possible, and apply TLS or authentication to block DNS‑rebind or similar attacks [1][3].  \n",
       "\n",
       "To integrate MCP, developers can:  \n",
       "1) implement a server that registers needed primitives and advertises them in initialize.result.capabilities;  \n",
       "2) validate all inputs and set reasonable timeouts;  \n",
       "3) or consume existing servers via SDKs—select a transport, send initialize, then invoke or subscribe to tools/resources exactly as negotiated [4][5].  \n",
       "\n",
       "### Sources  \n",
       "[1] MCP Protocol Specification: https://www.claudemcp.com/specification  \n",
       "[2] Lifecycle – Model Context Protocol: https://modelcontextprotocol.info/specification/draft/basic/lifecycle/  \n",
       "[3] Transports – Model Context Protocol: https://modelcontextprotocol.io/specification/2025-03-26/basic/transports  \n",
       "[4] Core Architecture – Model Context Protocol: https://modelcontextprotocol.io/docs/concepts/architecture  \n",
       "[5] Architecture – Model Context Protocol Specification: https://spec.modelcontextprotocol.io/specification/2025-03-26/architecture/\n",
       "\n",
       "## Ecosystem Spotlight: Notable MCP Servers\n",
       "\n",
       "Hundreds of MCP servers now exist, spanning core data access, commercial platforms, and hobby projects—proof that the protocol can wrap almost any tool or API [1][2].\n",
       "\n",
       "Reference servers maintained by Anthropic demonstrate the basics.  Filesystem, PostgreSQL, Git, and Slack servers cover file I/O, SQL queries, repository ops, and chat workflows.  Developers can launch them in seconds with commands like  \n",
       "`npx -y @modelcontextprotocol/server-filesystem` (TypeScript) or `uvx mcp-server-git` (Python) and then point any MCP‑aware client, such as Claude Desktop, at the spawned process [1].\n",
       "\n",
       "Platform vendors are adding “first‑party” connectors.  Microsoft cites the GitHub MCP Server and a Playwright browser‑automation server as popular examples that let C# or .NET apps drive code reviews or end‑to‑end tests through a uniform interface [3].  Other partner servers—e.g., Cloudflare for edge resources or Stripe for payments—expose full product APIs while still enforcing user approval through MCP’s tool‑calling flow [2].\n",
       "\n",
       "Community builders rapidly fill remaining gaps.  Docker and Kubernetes servers give agents controlled shell access; Snowflake, Neon, and Qdrant handle cloud databases; Todoist and Obsidian servers tackle personal productivity.  Because every server follows the same JSON‑RPC schema and ships as a small CLI, developers can fork an existing TypeScript or Python implementation and swap in their own SDK calls to create new connectors in hours, not weeks [2].  \n",
       "\n",
       "### Sources  \n",
       "[1] Example Servers – Model Context Protocol: https://modelcontextprotocol.io/examples  \n",
       "[2] Model Context Protocol Servers Repository: https://github.com/madhukarkumar/anthropic-mcp-servers  \n",
       "[3] Microsoft partners with Anthropic to create official C# SDK for Model Context Protocol: https://devblogs.microsoft.com/blog/microsoft-partners-with-anthropic-to-create-official-c-sdk-for-model-context-protocol\n",
       "\n",
       "## Agent‑to‑Agent (A2A) Protocol and Comparison with MCP  \n",
       "\n",
       "Google’s Agent‑to‑Agent (A2A) protocol, announced in April 2025, gives autonomous agents a common way to talk directly across vendors and clouds [2]. Its goal is to let one “client” agent delegate work to a “remote” agent without sharing internal code or memory, enabling true multi‑agent systems.  \n",
       "\n",
       "Discovery starts with a JSON Agent Card served at /.well‑known/agent.json, which lists version, skills and endpoints [3]. After discovery, the client opens a Task—an atomic unit that moves through states and exchanges Messages and multimodal Artifacts. HTTP request/response, Server‑Sent Events, or push notifications are chosen based on task length to stream progress safely [2].  \n",
       "\n",
       "Anthropic’s Model Context Protocol (MCP) tackles a different layer: it links a single language model to external tools and data through a Host‑Client‑Server triad, exposing Resources, Tools and Prompts over JSON‑RPC [1]. Communication is model‑to‑tool, not agent‑to‑agent.  \n",
       "\n",
       "Google therefore calls A2A “complementary” to MCP: use MCP to give each agent the data and actions it needs; use A2A to let those empowered agents discover one another, coordinate plans and exchange results [1]. In practice, developers might pipe an A2A task that, mid‑flow, invokes an MCP tool or serve an MCP connector as an A2A remote agent, showing the standards can interlock instead of compete.  \n",
       "\n",
       "### Sources  \n",
       "[1] MCP vs A2A: Comprehensive Comparison of AI Agent Protocols: https://www.toolworthy.ai/blog/mcp-vs-a2a-protocol-comparison  \n",
       "[2] Google A2A vs MCP: The New Protocol Standard Developers Need to Know: https://www.trickle.so/blog/google-a2a-vs-mcp  \n",
       "[3] A2A vs MCP: Comparing AI Standards for Agent Interoperability: https://www.ikangai.com/a2a-vs-mcp-ai-standards/\n",
       "\n",
       "## Conclusion\n",
       "\n",
       "Model Context Protocol (MCP) secures a model’s immediate tool belt, while Google’s Agent‑to‑Agent (A2A) protocol enables those empowered agents to find and hire one another. Their scopes differ but interlock, giving developers a layered recipe for robust, multi‑agent applications.\n",
       "\n",
       "| Aspect | MCP | A2A |\n",
       "| --- | --- | --- |\n",
       "| Layer | Model‑to‑tool RPC | Agent‑to‑agent orchestration |\n",
       "| Session start | `initialize` handshake | Task creation lifecycle |\n",
       "| Discovery | Client‑supplied server URI | `/.well‑known/agent.json` card |\n",
       "| Streaming | Stdio or HTTP/SSE | HTTP, SSE, or push |\n",
       "| Best fit | Embed filesystems, DBs, SaaS APIs into one agent | Delegate subtasks across clouds or vendors |\n",
       "\n",
       "Next steps: prototype an A2A task that internally calls an MCP PostgreSQL server; harden both layers with TLS and capability scoping; finally, contribute a new open‑source MCP connector to accelerate community adoption."
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Display the final generated report\n",
    "# Retrieve the completed report from the graph's state and format it for display\n",
    "\n",
    "final_state = graph.get_state(thread)\n",
    "report = final_state.values.get('final_report')\n",
    "Markdown(report)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Trace: \n",
    "\n",
    "> Note: uses 80k tokens \n",
    "\n",
    "https://smith.langchain.com/public/31eca7c9-beae-42a3-bef4-5bce9488d7be/r"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "open-deep-research-env",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
